

WHAT IS CLAIMED IS:

1. For use in a processor having an at least four-wide
instruction issue architecture, a mechanism for pipeline processing
multiply-accumulate instructions with out-of-order completion,
comprising:

a multiply-accumulate unit (MAC) having an initial multiply
stage and a subsequent accumulate stage; and

out-of-order completion logic, associated with said MAC, that
causes interim results produced by said multiply stage to be stored
when said accumulate stage is unavailable and allows younger
instructions to complete before said multiply-accumulate
instructions.

2. The mechanism as recited in Claim 1 wherein said initial
multiply stage and said subsequent accumulate stage are single
clock cycle stages.

3. The mechanism as recited in Claim 1 wherein said
out-of-order completion logic is contained in a writeback stage of a
pipeline in said processor.

4. The mechanism as recited in Claim 1 wherein said
out-of-order completion logic writes back said interim results to at least
one register in said MAC before said multiply-accumulate
instructions arrive at said accumulation stage of said MAC.

5. The mechanism as recited in Claim 1 wherein said interim
results are unavailable to an external program executing in said
processor.

6. The mechanism as recited in Claim 1 wherein grouping
logic within said processor groups said multiply-accumulate
instructions based on said mechanism.

7. The mechanism as recited in Claim 1 wherein said
processor is a digital signal processor.

8. For use in a processor having an at least four-wide
instruction issue architecture, a method of pipeline processing
multiply-accumulate instructions with out-of-order completion,
comprising:

providing a multiply-accumulate unit (MAC) having an initial
multiply stage and a subsequent accumulate stage;

causing interim results produced by said multiply stage to be
stored when said accumulate stage is unavailable; and

allowing younger instructions to complete before said
multiply-accumulate instructions.

9. The method as recited in Claim 8 wherein said initial
multiply stage and said subsequent accumulate stage are single
clock cycle stages.

10. The method as recited in Claim 8 wherein said causing is
carried out in a writeback stage of a pipeline in said processor.

11. The method as recited in Claim 8 wherein said causing
comprises writing back said interim results to at least one
register in said MAC before said multiply-accumulate instructions
arrive at said accumulation stage of said MAC.

12. The method as recited in Claim 8 wherein said interim
results are unavailable to an external program executing in said
processor.

13. The method as recited in Claim 8 further comprising
grouping said multiply-accumulate instructions based on said
mechanism.

14. The method as recited in Claim 8 wherein said processor
is a digital signal processor.

15. A digital signal processor (DSP), comprising:

a pipeline having stages and capable of processing
multiply-accumulate instructions;

an instruction issue unit containing grouping logic and at
least four-wide instruction issue logic;

a multiply-accumulate unit (MAC), coupled to said instruction
issue logic, having an initial multiply stage and a subsequent
accumulate stage; and

out-of-order completion logic, associated with said pipeline,
that causes interim results produced by said multiply stage to be
stored when said accumulate stage is unavailable and allows younger
instructions to complete before said multiply-accumulate
instructions.

16. The DSP as recited in Claim 15 wherein said initial
multiply stage and said subsequent accumulate stage are single
clock cycle stages.

17. The DSP as recited in Claim 15 wherein said out-of-order
completion logic is contained in a writeback stage of said
pipeline.

18. The DSP as recited in Claim 15 wherein said out-of-order
completion logic writes back said interim results to at least one
register in said MAC before said multiply-accumulate instructions
arrive at said accumulation stage of said MAC.

19. The DSP as recited in Claim 15 wherein said interim
results are unavailable to an external program executing in said
DSP.

20. The DSP as recited in Claim 15 wherein said grouping
logic groups said multiply-accumulate instructions based on said
mechanism.